Text File | 1992-04-27 | 25KB | 656 lines
INTRODUCTION
------------
If you have ever read a book about assembler, you were probably
chased by weird things like the carry flag, the status register, the
program counter etc... Maybe you've read about the 'Arithmetic &
Logic Unit', about 'CPU' and 'STACK'.
If so, just put it all aside for a while. Most things are very
important for the understanding of a computer's way of working,
but they have very little to do with coding itself. Most things
will become clear while we're working,...
If you have never coded anything in your life, you should seriously
consider something easier to start with, like BASIC. It is rather
important to know a bit about programming techniques, like loops,
structured code, subroutines... before you start with assembler.
Assembler is a second generation language, which means that there
are not many tools to make life easy for the programmer. For
example there's no 'print' instruction in assembler. This doesn't
make things easier for a total beginner...
A FIRST PROGRAM
---------------
Here follows a first VERY easy program. We'll use this to build
up our knowledge of assembler. Of course, before you start making
demos, you have to know assembler. Don't worry if you don't get
it yet, this will change very rapidly. Here we go:
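top:    move.b  #5,d0           ; put the value 5 in d0
line1:  tst.b   d0              ; compare d0 with zero
        beq     end             ; if it is zero, we're done
        sub.b   #1,d0           ; else subtract 1 from d0
        bra     line1           ; and test again
end:    rts                     ; return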
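If you know a high-level language, it may help to see the same loop written out there. What follows is only a rough Python model of the six lines above, not Amiga code; the function name `count_down` and the list `seen` are made up for the illustration.

```python
def count_down(start=5):
    """Rough model of the first program: d0 counts down until it is zero."""
    d0 = start & 0xFF             # move.b #5,d0   (only 8 bits are used)
    seen = []                     # every value that tst.b gets to look at
    while True:
        seen.append(d0)           # line1: tst.b d0
        if d0 == 0:               # beq end
            break
        d0 = (d0 - 1) & 0xFF      # sub.b #1,d0
        # bra line1: go round the loop again
    return d0, seen               # end: rts


print(count_down())
```

The loop body runs five times, and the test is done six times (the last one finds zero and leaves).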
GENERAL THINGS - VERY IMPORTANT
-------------- - - - - - - -
** 1 In assembler, we mostly use Hexadecimal or binary numbers rather
than the normal decimals. This is kinda confusing in the beginning,
coz we're used to normal numbers, but in fact it's easy: let's try.
For example the number 423, let's have a look... Think of the meaning
of the position of the digits: the '4' in 423 means in fact 4x100
(= 4x 10^2), the 2 means 2x10 (=2x 10^1), the 3 means 3x1 (=3x 10^0)
as you see, each position represents another power of 10. Each digit
can have a value from 0 to 9. In binary notation, we have powers of
2, and each digit can have a value from 0 to 1. 100101 means in fact
1x2^5 + 0x2^4 + 0x2^3 + 1x2^2 + 0x2^1 + 1x2^0 (= 32 + 4 + 1 = 37).
The same thing can be said about 'hexadecimal' numbers: they use
powers of 16, and each digit can have a value from 0 to 15
(0,1,...9,a,b,c,d,e,f).
You can count for yourself:   decimal 10 = hexadecimal a
                              decimal 16 = hexadecimal 10 (1 x 16^1)
                              decimal 2  = binary 10      (1 x 2^1)
                              decimal 4  = binary 100     (1 x 2^2)
                              decimal 5  = binary 101     (1x2^2 + 1x2^0)
This is of course nice to know, but you can hardly sit down and count
each value with powers-of-i-dunno-what, coz it might take HOURS !!!!
Therefore, we have - luckily - the ASSEMBLER !! If you use asmone
(you probably do), you can easily convert any number from binary
to hexadecimal to decimal, or back. You just tell asmone what kind of
number you have for him, and he'll do the rest for you. For each
kind there is a special 'MARKER':
BINARY NUMBERS have a '%' preceding: %1001101011
HEXADECIMAL NUMBERS use a '$' before them: $38a83b
DECIMAL NUMBERS have nothing: 12353
$10 is something completely different than %10 or 10 !!!
Later, when you make a program, you can enter any number in the
notation you want: just put the correct prefix. You'll soon
discover that in some cases, hexadecimal notation is preferred, in
other cases binary notation might be more interesting... It depends.
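To check the three notations without an assembler at hand, here is a small sketch in Python (used only as a calculator, it is not Amiga code). The helper `parse_number` is a made-up name; it simply reads the '%' and '$' markers described above.

```python
def parse_number(text):
    """Turn a literal with a '%', '$' or no marker into a plain number."""
    if text.startswith("%"):
        return int(text[1:], 2)     # binary: powers of 2
    if text.startswith("$"):
        return int(text[1:], 16)    # hexadecimal: powers of 16
    return int(text, 10)            # no marker: decimal


for literal in ("%10", "10", "$10", "%100101", "$a"):
    print(literal, "= decimal", parse_number(literal))
```

The first three lines of output show why $10, %10 and 10 must not be mixed up: they are 16, 2 and 10.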
** 2 Now about BYTES, WORDS and LONGWORDS.
A computer's memory is built up of enormous amounts of bits. These
are in fact switches that can have 2 states: on or off, 1 or 0. To make
it all a bit easier to handle, these bits are grouped: if you group
8 bits, you have a BYTE. 16 bits (2 bytes) are called a WORD, and
32 bits (2 words, 4 bytes) are called a LONGWORD.
You can do things with these groups of bits, like putting a certain
value into them. It's obvious you can move larger numbers in a
LONGWORD than in a word or in a byte.
Grouping all these bits into bytes, you get the memory, where each
byte has its own number-in-the-row. This number is the 'address' of
this byte. Address 1493 is byte number 1493 in memory (counting from 0).
(Addresses are mostly expressed as HEXADECIMAL numbers, like $fc0000)
Each address contains a value, made up by those bits. For example
address number $10248 could contain the value 100 (just a silly example)
In hexadecimal notation, the value of one byte in memory can be
between #$00 and #$ff. (decimal: between #0 and #255)
If you wish to take a word in memory, you just take 2 successive
bytes from memory (for example $10000 & $10001). We say in this case
that the word is at $10000 (and not $10001). WORDS CAN ONLY START ON
AN EVEN ADDRESS (so a word can not be for example on $10001 & $10002).
If address $10000 contains the byte #$34 and address $10001 contains
the byte #$a8, the WORD at $10000 contains #$34a8.
Longwords have the same story: only at EVEN ADDRESSES, but here you
take 4 bytes in a row. Bytes #$10, #$3a, #$29, #$00 make the LONGWORD
#$103a2900. You see, to switch between bytes, words and longwords,
the hexadecimal notation is much easier than the decimal.
Whether data in memory is used as a BYTE, WORD or LONGWORD is not
predefined. It just depends on how you wish to use it.
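The way successive bytes make up a word or a longword can be checked with a few lines of Python (a sketch only; `join_bytes` and `split_bytes` are made-up helper names). The first byte in memory becomes the leftmost part of the number, exactly as in the examples above.

```python
def join_bytes(values):
    """Combine successive bytes into one number, first byte leftmost."""
    return int.from_bytes(bytes(values), "big")


def split_bytes(value, size):
    """Split a number into `size` bytes (2 for a word, 4 for a longword)."""
    return list(value.to_bytes(size, "big"))


print(hex(join_bytes([0x34, 0xA8])))                  # the WORD from the text
print(hex(join_bytes([0x10, 0x3A, 0x29, 0x00])))      # the LONGWORD from the text
print([hex(b) for b in split_bytes(0x103A2900, 4)])   # and back to 4 bytes
```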
** 3 Now about the DATA- AND ADDRESS-REGISTERS.
These are zones in the processor where you can temporarily store
data. In the amiga, there are 8 data and 8 addressregisters:
d0, d1,... d7 and a0,a1...a7. They're used VERY OFTEN. The
special thing about them, and this is in fact an aspect of
the hardware, is that they sit inside the PROCESSOR itself,
where the memory does not. Therefore, the registers can be accessed
much faster than memory. REGISTERS are also used to store the
data that will be used for a mathematical calculation, and once this
calculation is done, the results will be in the registers again.
You can't for example subtract the value in address x from the
value in address y, and put the results in address z, no, you must
put the values in the registers first, then perform the subtraction,
and then you'll get the results in the same register. Examples follow
in some lines. I've to tell you just one thing before that.
** 4 1) move.l #$1000,d0
2) move.l $1000,d0
Do you see the difference between these 2 lines? In line 1 is a
'#' before the number.
LINE 1 MEANS: PUT THE VALUE '$1000' IN DATAREGISTER D0
LINE 2 MEANS: PUT THE CONTENTS OF ADDRESS $1000 IN DATAREGISTER D0.
This is of course a big difference.
If address $1000 contained the value #$129475, in the second case,
this value would be moved into D0.
VALUES ALWAYS HAVE A '#' PRECEDING. Addresses have nothing.
*******
Now take back the small program. See the 'MOVE.B #5,d0' ?
You ought to understand everything of it now:
The '.B' means that you're gonna work with only 8 bits at a time.
(BYTE)
The # means that a VALUE is about to follow: the value is #5.
There's no $ or % before the number, which means that the number
is decimal. (note: decimal 5 and hexadecimal 5 are the same...)
D0 is a dataregister in which the number will be moved.
So: move.b #5,d0 means: move decimal value 5 into dataregister 0.
The size of the moved value is 8 bits.
(note: one register contains 32 bits)
MOVE.L #$1000,d1
this means: move the hexadecimal value 1000 in datareg1.
here you transfer a LONGWORD (32 bits)
MOVE.W $2000,d1
now you move the WORD that is stored at address $2000
into datareg1. Let's say that the memory looks as follows:
addr: $1ffe $1fff $2000 $2001 $2002 $2003 $2004
value: #$a4 #$4d #$00 #$48 #$29 #$00 #$35
The value moved to D1 will be in that case: #$0048
(1 word = 2 bytes, starting from $2000)
If we moved a longword (move.l $2000,d1) the value would
be #$00482900
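The difference between '#$2000' (a value) and '$2000' (an address) can be played through with the memory row shown above. This is a Python sketch, not Amiga code; `BASE`, `CONTENTS` and `fetch` are names invented for the example.

```python
BASE = 0x1FFE                     # address of the first byte in the row
CONTENTS = bytes([0xA4, 0x4D, 0x00, 0x48, 0x29, 0x00, 0x35])


def fetch(address, size):
    """Read `size` bytes starting at `address`, first byte leftmost."""
    offset = address - BASE
    return int.from_bytes(CONTENTS[offset:offset + size], "big")


immediate = 0x2000                # move.w #$2000,d1 : the value itself
word = fetch(0x2000, 2)           # move.w $2000,d1  : what is stored there
longword = fetch(0x2000, 4)       # move.l $2000,d1

print(hex(immediate), hex(word), hex(longword))
```

With the '#', the number itself ends up in the register; without it, the number is only used to find the data.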
I think the other lines are not too difficult.
Now you know very much already. But there's much more to come...
You see that we moved the value to a dataregister: D0. That's
not necessary. We could also move it to an addressreg, or to an
address in memory. These are various 'ADDRESSING METHODS', and if
you use the addressing methods in a clever way, your program can
be much faster or better... Which are these addressing methods ?
Let's explain it with an example for each one:
( '.x' means it can be anything (byte, word or longword)
#x means a value like #400, #$ffa0 or #%10010101
addr means an address like $c0000
Dx means 'any dataregister' (d0 - d7)
Ax means 'any addressregister' (a0 - a7)
)
NORMAL ADDRESSINGS (immediate, absolute and register direct)
- - - - - - - - -
MOVE.x #x,addr   : move a value to an address
MOVE.x addr,addr : move the contents of an address to another address
MOVE.x #x,Dx     : move a value x to a dataregister
MOVE.x addr,Dx   : move the contents of an address to a dataregister
MOVE.x Dx,addr   : move the contents of a datareg into an address
MOVE.x #x,Ax     : move a value to an addressregister
                   since addresses in the AMIGA are 32 bits long,
                   most of these kind of moves are LONG (Move.L)
MOVE.x addr,Ax   : move the contents of an address to an addressreg.
                   also most times a longword.
MOVE.x Ax,addr   : move the contents of an addrreg into an address
MOVE.x Dx,Dy     : move from one datareg into another one
MOVE.x Ax,Dy       or from an addressreg into a datareg
MOVE.x Dx,Ay       or any other combination you can think of
...
INDIRECT ADDRESSING:
- - - - - - - - - -
MOVE.x #x,(Ax)
THIS IS VERY INTERESTING !!! now you move the value 'x' not into
the addressregister, (like in 'MOVE.x #x,Ax' ) but into the address
that is stored in this addressregister.
For example if A0 is currently filled with the VALUE #$fc0000,
MOVE.B #4,(A0)
would cause the ADDRESS $fc0000 to be filled with the value #4.
(This would have the same effect as MOVE.B #4,$fc0000)
Now you see why A-registers are called ADDRESSregisters. The values
that are stored in an ADDRESSREGISTER represent ADDRESSES.
Indirect addressing can only be done with addressregisters,
so MOVE.x #x,(Dx) isn't allowed. Values stored in a DATAREGISTER
represent VALUES only.
All combinations are allowed:
move.x (Ax),Dy
move.x (Ax),addr
move.x (Ax),(Ay)
move.x Dx,(Ay)
...
You can also put an OFFSET with the (Ax). If A0 contains $10000
and you wanna put something in $10004 (which happens to be A0 + #$4)
you just : MOVE.x #x,4(A0)
- watch the notation: hexadecimal for example $a(A0)
  Don't put a '#' (wrong: #$a(A0) )
- the offset is limited to a signed 16-bit number
  (-32768 to 32767, or -$8000 to $7fff)
  this would be too large: $102447(A0)
  but this is still OK: $200(A0)
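Indirect addressing with and without an offset can be modelled in a few lines of Python. This is only a sketch: the tiny `memory`, the address in `a0` and the helper names are made up, and the offset check assumes the signed 16-bit range that the 68000 uses for this addressing mode.

```python
memory = bytearray(0x20)      # made-up memory of 32 bytes, all zero
a0 = 0x10                     # pretend A0 holds the address $10


def move_b(value, address):
    """Store one byte at the given address."""
    memory[address] = value & 0xFF


def offset_fits(offset):
    """True when the offset fits in a signed 16-bit number."""
    return -0x8000 <= offset <= 0x7FFF


move_b(4, a0)                 # MOVE.B #4,(A0)
move_b(7, a0 + 4)             # MOVE.B #7,4(A0)
move_b(9, a0 - 4)             # MOVE.B #9,-4(A0)

print(memory[0x10], memory[0x14], memory[0x0C], hex(a0))
print(offset_fits(0x200), offset_fits(0x102447))
```

Note that A0 itself is not changed by any of these moves; only the address that is used gets the offset added.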
INDIRECT ADDRESSING WITH POSTINCREMENT:
- - - - - - - - - - - - - - - - - - - -
example: MOVE.B #0,(A0)+
That's another VERY INTERESTING way of addressing. Let's say you
wish to fill a row of addresses starting from $10000 with value 0.
Then you put this address in an addressregister, and you use indirect
addressing WITH 'POSTINCREMENT'. AFTER THE INSTRUCTION (in this
case MOVE), the addressregister will be increased automatically, so
that it points to the next byte, word or longword (depending on the
size: move.b .w or .l)
let's say A0 contains #$10000
we do a MOVE.L #$2da30,(A0)+
memory will look like this:
$10000 $10001 $10002 $10003
#$00 #$02 #$da #$30
and the value in A0 will be #$10004.
You can't put an offset at the addressreg: WRONG: MOVE.x #x,4(A0)+
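The postincrement example can be replayed in Python (a sketch, not Amiga code; `move_postinc` and the dictionary `memory` are made up for the illustration). The function stores the value first and only then hands back the increased address.

```python
def move_postinc(memory, address, value, size):
    """Store `value` as `size` bytes at `address`; return address + size."""
    for i, byte in enumerate(value.to_bytes(size, "big")):
        memory[address + i] = byte
    return address + size


memory = {}
a0 = 0x10000
a0 = move_postinc(memory, a0, 0x2DA30, 4)        # MOVE.L #$2da30,(A0)+
print([hex(memory[a]) for a in sorted(memory)], hex(a0))

# Filling a row of 4 bytes with 0, one byte at a time:
a0 = 0x10000
for _ in range(4):
    a0 = move_postinc(memory, a0, 0, 1)          # MOVE.B #0,(A0)+
print([memory[a] for a in sorted(memory)], hex(a0))
```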
Almost the same thing is: IND.ADDR. WITH PREDECREMENT
- - - - - - - - - - - - - -
example: MOVE.B #0,-(A0)
Here, the contents of A0 will be decreased to the first lower
byte (or word or longword, depending on the size again), and THEN
the move will be done.
So, if A0 contains #$5c000:
MOVE.W #$204a,-(A0)
will have the following effect:
A0 will be decreased by 2 (1 word = 2 bytes) making it #$5bffe
and then #$204a will be put in locations $5bffe & $5bfff
Don't mix up predecrement and negative offsets:
MOVE.x #x,-(A0) ; Ind.addressing with predecrement
MOVE.x #x,-4(A0) ; Ind.addressing with an offset '-4'
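Predecrement works the other way round: first the register goes down, then the move is done. A small Python sketch (with the made-up helper `move_predec`) shows the order of things for a word moved with A0 starting at $5c000.

```python
def move_predec(memory, address, value, size):
    """Decrease `address` by `size`, then store `value` there."""
    address -= size
    for i, byte in enumerate(value.to_bytes(size, "big")):
        memory[address + i] = byte
    return address


memory = {}
a0 = 0x5C000
a0 = move_predec(memory, a0, 0x204A, 2)          # MOVE.W #$204a,-(A0)
print(hex(a0), [hex(a) + ":" + hex(memory[a]) for a in sorted(memory)])
```

A negative offset like -4(A0) would use the address A0 - 4 as well, but it would leave A0 itself untouched.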
SPECIAL EXAMPLE OF POSTINC/PREDECR
- - - - - - - - - - - - - - - - -
MOVEM.L D0-D7/A0-A6,-(A7)
MOVEM.L (A7)+,D0-D7/A0-A6
these are 2 special instructions, as you see they
use postincr and predecr. addressing.
MOVEM means 'move multiple'. The first instruction
decreases the value in A7, moves one register to that
address, decreases A7 again, moves the next register,
and so on until all listed registers are saved.
(With predecrement the processor starts at the end
of the list, so D0 ends up at the lowest address.)
The second instruction gets all these values back
from (A7) and puts them in the registers.
This way you are able to save the contents of all
the registers, and get them back after for example
you performed a subroutine (where you changed them)
A7 is what is called the STACKPOINTER, in it is the
address of the 'STACK', a place where important
values are stored as 'FIRST IN LAST OUT'. If you
put D0 in it, then D1, then D2, the first value
that you'll get back is D2, then D1 and then D0.
This is also used for jumping to subroutines:
main: bsr routine
...
routine:....
...
rts
The BSR will make the computer save the current address on the stack,
and when the routine is finished (RTS), this address will be taken
back from the stack, so the program can continue from where it left.
(this is done automatically, Don't worry)
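The 'FIRST IN LAST OUT' behaviour of the stack is easy to try out with a Python list (a sketch only; `push`, `pull`, `bsr` and `rts` are made-up helpers, and the list stands in for the memory that A7 points into).

```python
stack = []                        # stands in for the memory below A7


def push(value):                  # like MOVE.L Dx,-(A7)
    stack.append(value)


def pull():                       # like MOVE.L (A7)+,Dx
    return stack.pop()


def bsr(return_address):          # BSR saves where to continue afterwards
    push(return_address)


def rts():                        # RTS takes that address back
    return pull()


push(0x11)                        # D0
push(0x22)                        # D1
push(0x33)                        # D2
order = [pull(), pull(), pull()]
print([hex(v) for v in order])    # D2 comes back first, D0 last

bsr(0x12004)                      # hypothetical address after the BSR
print(hex(rts()))
```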
ABOUT RELATIVE ADDRESSING
-------------------------
You surely know that AMIGA is a multitasking computer. That means
that you can run more programs at 1 time, and so have more than 1
program in memory at 1 time. If you load a program, it's never sure
where this program will be. If you just loaded another program, there
will be no place for this second one on this place, so all depends !!
A program that is loaded, first tells amiga how much memory it will
need, amiga checks where he has some room, and he tells the program
the start of this room. A program can therefore never say for example
jump to address $10000, because he doesn't know if he will be
located there. The program instead says: jump to (starting location
+ offset):
$10000: starting location
...
$12000: routine
It would be something like : JMP (STARTING LOCATION + $2000)
BUT !!NOT!! : JMP $12000
because when the program gets loaded another time, the starting
location could be for example $14000, putting the routine at $16000.
The line 'JMP starting location + $2000' however is still valid.
This has its consequences when using asmone, or any other language on
AMIGA. You can for example not put a picture on address $70000,
(although many BAAAAAD coders do this) because you simply don't know
if there's room on that location. (don't get it wrong: you CAN do
it, but it's WRONG. The computer could crash)
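The 'starting location + offset' idea comes down to one addition. In this Python sketch (names made up for the example) the routine always sits $2000 bytes after the start, wherever the program happens to be loaded.

```python
ROUTINE_OFFSET = 0x2000           # distance from the start of the program


def routine_address(starting_location):
    """Where the routine ends up for a given load address."""
    return starting_location + ROUTINE_OFFSET


for start in (0x10000, 0x14000):
    print(hex(start), "->", hex(routine_address(start)))
```

The fixed number $12000 is only right for the first load address; the offset $2000 is right for both.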
LABELS!
- - - -
In asmone, addresses are given a name: a LABEL. This is very interesting.
You just give a routine a label, and if you wish to jump to it, you
say "JMP labelname" instead of JMP address. (The latter would be against
the rules of relative addressing.) When you assemble your source,
asmone will look where there's room to put it, and change the labels
into relative addresses. In fact YOU don't have to worry anymore.
The program we saw earlier contained labels too:
Top, Line1, and End are labels. asmone will calculate the correct
addresses for them when you assemble it.
In fact, you should replace each word 'address' in this text with
the word 'label'. For example you must write
MOVE.x #x,label    instead of    MOVE.x #x,addr
MOVE.x label,Dx       "   "      MOVE.x addr,Dx
....                             ....
In the assembling, each command will be translated into a number
which is the 'RAW' machinecode for this command. (for example:
jmp will get the number #$4ef9 ) This value is stored somewhere
in memory, at a certain location 'starting location + offset'
Values of successive commands will be put behind each other in memory.
************
Now the time is right to again attract your attention to
'GENERAL THINGS #4' (read this part again please)
You should be able to tell me what the difference is between
MOVE.L LABEL,D0
MOVE.L #LABEL,D0
But to be sure, I'll tell you: in the first case, the longword-value
that is in addresses LABEL, LABEL+1, LABEL+2 and LABEL+3, will be
put in D0. In the second case, the addressVALUE of LABEL will be put
in D0. We don't know this address until it is assembled.
some examples on this:
* program: move.b  data,D0      ; the CONTENTS of 'data'
           rts
  data:    dc.b    10
Now, D0 will contain the byte stored at address 'data' (#10)
* program: move.l  #data,A0     ; the addressVALUE of 'data'
           move.b  (A0)+,D0
           move.b  (A0),D1
           rts
  data:    dc.b    10,11
first we moved the address 'data' to A0. Then we moved a byte stored
at (A0) to D0, A0 was increased by one BYTE, and we stored the byte
at (A0) to D1. D0 will contain #10, and D1 will contain #11
Please note: if you put an address into an addressregister, like in
the last example, you must use a LONGWORD move (MOVE.L) because each
address is 32 bits long. A 'MOVE.B' into an addressregister isn't
allowed at all, and with 'MOVE.W' you would only have moved a part
of the address. This is not forbidden, but it was not your
intention: if the address of 'data' was $00073a00, a
MOVE.W #data,A0 would have caused A0 to be filled with $3a00, which
is also an address, but not the address you wanted !!!
REMARK
------
Dataregs like D0, and addressregs like A0, have a length of 32 bits,
in other words: LONGWORDS.
If you move a BYTE or a WORD into a dataregister, it won't be
filled up completely. The part that is not filled isn't
changed (a WORD moved into an addressregister is different: it is
extended to fill all 32 bits)... example:
D0 contained: #$12345678 (longword)
MOVE.B #$00,D0
now D0 contains: #$12345600
MOVE.W #$3333,D0
now D0 contains: #$12343333
MOVE.L #$3333,D0
now D0 contains: #$00003333
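The three moves above can be checked with a little masking in Python. This is a sketch for dataregisters only; `move_to_datareg` is a made-up helper and `size` is the number of bytes moved (1, 2 or 4).

```python
def move_to_datareg(register, value, size):
    """Replace only the lowest `size` bytes of a 32-bit register."""
    mask = (1 << (8 * size)) - 1              # $ff, $ffff or $ffffffff
    kept = register & ~mask & 0xFFFFFFFF      # the part that is not touched
    return kept | (value & mask)


d0 = 0x12345678
d0 = move_to_datareg(d0, 0x00, 1)             # MOVE.B #$00,D0
print(hex(d0))
d0 = move_to_datareg(d0, 0x3333, 2)           # MOVE.W #$3333,D0
print(hex(d0))
d0 = move_to_datareg(d0, 0x3333, 4)           # MOVE.L #$3333,D0
print(hex(d0))
```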
REMARK2
-------
Please refer back to GENERAL THINGS #2. You'll see that a word
or a longword can only start at EVEN addresses. Now look at this
program:
start:  move.w ....
        .....           ; other commands
        .....
data:   dc.b 0          ; 1 byte of data
routine:move.....       ; again commands....
asmone takes care that the start of this program is at an even
address, and because all commands are an even number of bytes long,
all other commands will be put on even addresses too. BUT, now we
put ONE byte at a certain location. The next commands will start
on an ODD address, which is forbidden. One simple mistake like
this could have caused a 'system crash' if you didn't have asmone.
asmone of course notices this mistake and says: WORD AT ODD ADDRESS.
All you have to do is put the command 'EVEN' behind the data
that caused the mistake:
data:   dc.b 0
        even
routine:....
It's best to put 'EVEN' behind each row of DC.B's
This is a similar mistake: (often made, often hard to find)
        move.l  #data,A0
        move.l  #otherdata,A1
        move.b  (A0)+,(A1)+
        move.w  (A0)+,(A1)+     ; **** wrong !!
        ...
data:   dc.b    0,20,12,23,....
        even
otherdata:
        dc.b    0,0,0,0,....
        even
This program first puts the addresses of 'data' and 'otherdata' in
2 addressregs. DATA is at an even address, correct. Then it moves
a BYTE from the data-row into the 'otherdata' row. The increment
from MOVE.B (A0)+,(A1)+ will add 1 to the even values in A0 and A1.
Next we want to move a WORD at (A0) to a WORD at (A1), and you guessed
right: it won't work... because they are now ODD, and a word can only
be at an EVEN address. Another MOVE.B (A0)+,(A1)+ would cause no
problem.
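Why the second move goes wrong can be seen by following the value of A0 by hand, or with a Python sketch like this one. The helper `check_move`, its error text and the address $20000 are all invented for the example; on the real machine the processor itself refuses the odd word access.

```python
def check_move(address, size):
    """Refuse a word or longword at an odd address; return address + size."""
    if size > 1 and address % 2 == 1:
        raise ValueError("word access at odd address $%x" % address)
    return address + size         # the register after the (Ax)+


a0 = 0x20000                      # made-up even address of 'data'
a0 = check_move(a0, 1)            # move.b (A0)+,... : fine, A0 is now odd
try:
    check_move(a0, 2)             # move.w (A0)+,... : not allowed
    result = "ok"
except ValueError as error:
    result = str(error)
print(result)
```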
SOME MORE COMMANDS
------------------
Till now, we've only seen the instruction MOVE. Here are some other ones.
You can use most of the addressing methods on these commands. I don't
know exactly which are allowed and which not, but you won't use
most of them after all, and if you accidentally use one that isn't
allowed, asmone will tell you, and it's just as soon corrected.
So here we go:
ADD.x a,b add a to b, result comes in b.
(a and b can be #x, label, Ax, Dx, x(Ax), (Ax)+,...)
SUB.x a,b subtract a from b, result in b.
(idem)
CMP.x a,b compare a with b. b is normally a register (Dx or
Ax); a value #x can also be compared with memory.
TST.x a compare a with zero. a can't be an addressregister
BSR label : branch to a subroutine. You'll get back with 'RTS'
BRA label : perform an unconditional branch to another location
in the program. You cannot return with RTS
BNE label : branch if not equal (after a CMP or TST)
BEQ label : branch if equal
BLT, BLE, BGT, BGE: branch if less than, less or equal, greater than,
greater or equal.
SWAP Dx : swap the contents of the lower word and the higher
word of a dataregister:
D0: $xxxx yyyy
swap d0
D0: $yyyy xxxx
moving addresses to an addressregister can be done with
MOVE.L #label,Ax
but there's a special command to do it, it's a bit faster:
LEA.L label,Ax
notice that there's no # anymore. LEA works only with addr.regs !!
NOT SO OFTEN USED COMMANDS
- - - - - - - - - - - - -
MULU a,b multiply a with b (this is a very slow command!)
DIVU a,b divide b by a (idem, avoid using them!)
OR.x a,b perform a logical OR with a on b
AND.x a,b " " AND with a on b
EOR.x a,b " " EOR with a on b
NOT.x a perform NOT on a
        a         100101011
        b         001100110
                  ---------
        a and b:  000100010   the result bit will only be set if
                              both bits of a AND b were set
or is the same but the result bit will only be set if the
bit in a OR the bit in b was set:
        a or b:   101101111
eor is EXCLUSIVE OR. only if a bit is set in a AND NOT in b,
or set in b and not in a, the result will be 1
        a eor b:  101001101
not moves 1 to 0 and 0 to 1: not (001010)= 110101
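The bit patterns above can be checked with Python's own logical operators (&, | and ^ do the same job as AND, OR and EOR). This is only a sketch; `bits` is a made-up helper that prints a number with leading zeros.

```python
def bits(value, width):
    """Show a number as a binary string of `width` digits."""
    return format(value, "0{}b".format(width))


a = 0b100101011
b = 0b001100110

print("a and b:", bits(a & b, 9))
print("a or b: ", bits(a | b, 9))
print("a eor b:", bits(a ^ b, 9))
# NOT has to be cut back to the number of bits we work with (6 here):
print("not:    ", bits(~0b001010 & 0b111111, 6))
```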
BTST #x,Dx : check if a bit is equal to zero (in a dataregister)
    D0: 10110010 01010111 10010100 00011011
        ^                 ^               ^
    bit 31                15              0
BTST #15,D0 -> not equal !! bit 15 is set !!
BCLR #x,Dx : clear a bit in a datareg
BSET #x,Dx : set a bit in a datareg
ASL.x #x,Dx : shift the bits in Dx x times to the left
              if the last bit pushed out on the left was 1,
              carry will be set (addressregisters can't be shifted)
ASR.x #x,Dx : shift to the right, if the last bit pushed out on
              the right was 1, carry will be set, else, carry
              will be cleared
BCS, BCC : branch if carry set/clear
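The bit commands and the shifts can be tried out on paper, or with a Python sketch like the one below. All helper names are made up; the shifts are shown for a BYTE only, and the right shift keeps the leftmost bit in place, which is what an arithmetic shift does.

```python
MASK32 = 0xFFFFFFFF


def btst(value, bit):
    """True when the tested bit is zero (the 'equal' case for BEQ)."""
    return (value >> bit) & 1 == 0


def bclr(value, bit):
    return value & ~(1 << bit) & MASK32


def bset(value, bit):
    return (value | (1 << bit)) & MASK32


def asl_b(value, count):
    """Shift a byte left `count` times; return (result, carry)."""
    carry = 0
    for _ in range(count):
        carry = (value >> 7) & 1          # the bit that falls off on the left
        value = (value << 1) & 0xFF
    return value, carry


def asr_b(value, count):
    """Shift a byte right `count` times, sign bit kept; return (result, carry)."""
    sign = value & 0x80
    carry = 0
    for _ in range(count):
        carry = value & 1                 # the bit that falls off on the right
        value = (value >> 1) | sign
    return value, carry


d0 = 0b10110010_01010111_10010100_00011011
print(btst(d0, 15))                       # bit 15 is set: the 'not equal' case
print(hex(bclr(d0, 15)), hex(bset(d0, 2)))
print(asl_b(0b10000001, 1), asr_b(0b00000011, 1))
```

A BCS after either of the two shifts in the last line would branch, because a 1 was pushed out in both cases.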
I think these are all the commands you'll use. A complete list with
full details will follow soon, but you don't need it yet. It's full
of numbers and symbols, it would just make things hard to survey.
Please refer to the sources to see some real examples & experiments...
asmone COMMANDS
---------------
You already knew it, but I say it anyway: asmone has 2 states:
EDITOR STATE and ONLINE STATE...
If you start asmone, you get a 'PROMPT', it looks like: 'ASM1>'
You can switch to the editor and back by pressing <ESC>
[..] means that this is not necessary
<label> means you mustn't type 'label' but just a labelname
Here's a list of online-commands:
a          - assemble the source
j[<label>] - jump (if you give a labelname, the program will
             jump to the address which corresponds with
             the label)
             (you could also enter an address like $30000 instead
             of any label)
l<string>  - look for a string in the source
@d<label>  - disassemble a part of your assembled source,
             starting at <label> (press return to continue)
@h<label>  - show memory in ascii & hex starting from a label
@m<label>  - modify memory... (not sure)
e          - load >extern files... see sources for examples
t          - go to the top of the file
b          - bottom
v[<dir>]   - show the contents of the current directory;
             if you give a dir-name, the current directory will
             change to that directory & display the contents.
v <dir>    - v+space: change the current directory but don't
             show contents.
r          - read a source (or show a requester to select one)
w          - write source
wo         - write the OBJECT (the assembled version, which is
             executable) to disk
IN EDITOR MODE:
SHIFT UP/DWN - fast moving in source
AMI B        - start a block (to cut/copy)
AMI C        - copy block
AMI X        - cut block
AMI I        - insert block
SHIFT Fx     - MARK A POSITION (1..3)
Fx           - JUMP TO MARKED POSITION
For more detailed info, see asmone DOCUMENTATION...
********
I think this is enough for this time... If you've read all this,
you've seen nearly each aspect of assembler. I realise this is
a whole bunch of information, so take your time to understand it.
It's not easy, though it seems 'NORMAL' for someone who knows it.
If you've read all this, it's time you'd take a look at the sources,
where you can experiment and have a look at some results. The
copies will follow soon, but you ABSOLUTELY don't need them yet.
First get used to the language and the characteristics of it.
I hope you could understand most of this bullshit, if you don't get
something, or you have questions, just ask me, and I'll try to
answer...
NOW HAVE A LOOK AT THE SOURCES AND MAKE SOME YOURSELF !!! TRYING IS
THE BEST WAY TO LEARN !!!
SEE YOU SOON ! GREETINGS from
Geert / EIKENLAAN 21 / 3740 BILZEN / BELGIUM